A Preliminary Study on Automatic Breast Cancer Data Classification using Semi-supervised Fuzzy c-Means
Authors
Abstract
Soria et al. have successfully identified six novel and clinically useful subgroups in the Nottingham Tenovus Breast Cancer dataset. However, the methodology used is semi-manual, and so far no single clustering method can classify the dataset automatically. In this work, two variants of the semi-supervised Fuzzy c-means (ssFCM) algorithm are explored to classify the Nottingham Tenovus Breast Cancer dataset into the same six subgroups. Three experiments were conducted using the two ssFCM algorithms, and the results were evaluated using inter-rater agreement measures. The ssFCM algorithms identified the six classes of breast cancer, but their results are in low agreement with Soria’s classification. This, together with the high agreement obtained using two clustering algorithms, suggests that the problem lies in the way we use ssFCM rather than in model correctness. Despite this, we consider the ssFCM results promising and note that further investigation of ssFCM is required.
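The abstract does not say which inter-rater agreement measures were used. As a minimal sketch, assuming Cohen's kappa is one of them and that each patient has a single hard class label from both the reference classification and the automatic method, agreement could be computed as follows. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length sequences of class labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over classes of the product of marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical class for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (illustrative only, not data from the paper).
reference = [1, 1, 2, 2, 3, 3]
predicted = [1, 1, 2, 3, 3, 3]
print(cohen_kappa(reference, predicted))
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. Fuzzy memberships would first need to be hardened, for example by assigning each patient to the class with the highest membership.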
Similar resources
A methodology for automatic classification of breast cancer immunohistochemical data using semi-supervised Fuzzy c-means
Previously, a semi-manual method was used to identify six novel and clinically useful classes in the Nottingham Tenovus Breast Cancer dataset. 663 out of 1076 patients were classified. The objectives of our work are threefold. Firstly, our primary objective is to use one single automatic method (post-initialisation) to reproduce the six classes for the 663 patients and to classify the remainin...
Investigating Distance Metrics in Semi-supervised Fuzzy c-Means for Breast Cancer Classification
In previous work, semi-supervised Fuzzy c-means (ssFCM) was used as an automatic classification technique to classify the Nottingham Tenovus Breast Cancer (NTBC) dataset as no method to do this currently exists. However, the results were poor when compared with semi-manual classification. It is known that the NTBC data is highly non-normal and it was suspected that this affected the poor result...
Semi-Supervised Techniques in Breast Cancer Classification: A Comparison between Transductive SVM and Semi-Supervised FCM
The Nottingham Tenovus Breast Cancer data has been successfully classified into six novel and clinically useful subgroups. However, the existing technique is semi-manual. In this work, we use Transductive Support Vector Machine (TSVM) and semi-supervised Fuzzy c-means (ssFCM) as automatic techniques to classify the dataset and evaluate our results using the 10-fold cross-validation technique. A ...
Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI
Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics, which can be used as biomarkers for prostate lesion detection and characterization. Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...
An investigation on scaling parameter and distance metrics in semi-supervised Fuzzy c-means
The scaling parameter α helps maintain a balance between supervised and unsupervised learning in semi-supervised Fuzzy c-Means (ssFCM). In this study, we investigated the effects of different α values, 0.1, 0.5, 1 and 10 in Pedrycz and Waletzky’s ssFCM with various amounts of labelled data, 10%, 20%, 30%, 40%, 50% and 60% and three distance metrics, Euclidean, Mahalanobis and kernel-based on th...
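For orientation, the role of α can be read from the objective function commonly attributed to Pedrycz and Waletzky's ssFCM. The form below is a sketch from the general literature, not from the truncated abstract above, and the symbols are assumptions: u_ik is the membership of pattern k in cluster i, d_ik its distance to prototype i, f_ik the supplied label membership, and b_k is 1 for labelled patterns and 0 otherwise.

```latex
J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{2}\, d_{ik}^{2}
  + \alpha \sum_{i=1}^{c} \sum_{k=1}^{N} \left( u_{ik} - f_{ik}\, b_{k} \right)^{2} d_{ik}^{2}
```

The first term is the unsupervised FCM cost and the second penalises memberships that deviate from the known labels, so a larger α gives the labelled patterns more influence.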